RNA structural probing of guanine and uracil nucleotides in yeast

RNA structure can be essential for its cellular function. Therefore, methods to investigate the structure of RNA in vivo are of great importance for understanding the role of cellular RNAs. RNA structure probing is an indirect method to assess the three-dimensional structure of RNA by analyzing the reactivity of different nucleotides to chemical modifications. Dimethyl sulfate (DMS) is a well-established compound that reports on the base-pairing context of adenine (A) and cytosine (C) in vitro and in vivo, but is not reactive to guanine (G) or uracil (U). Recently, new compounds were used to modify Gs and Us in plant, bacterial, and human cells. To complement the scope of RNA structural probing by chemical modifications in the model organism yeast, we analyze the effectiveness of guanine modification by the glyoxal family in Saccharomyces cerevisiae and Candida albicans. We show that within the glyoxal family of compounds, phenylglyoxal (PGO) is the best guanine probe for structural probing in S. cerevisiae and C. albicans. Further, we show that PGO treatment does not affect the processing of different RNA species in the cell and is not toxic to the cells under the conditions we have established for RNA structural probing. We also explore the effectiveness of uracil modification by 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate (CMCT) and demonstrate that uracils can be modified by CMCT in S. cerevisiae in vivo. Our results provide the conditions for in vivo probing of the reactivity of guanine and uracil nucleotides in RNA structures in yeast and offer a valuable tool for studying RNA structure and function in two widely used yeast model systems.


Introduction
As RNAs can be both single- and double-stranded and are highly flexible, they can adopt diverse secondary and tertiary structures in physiological conditions. Simple RNA structures include base-paired double-stranded regions such as hairpin stems. More complex RNA motifs comprise three-dimensional structures such as ribose zippers, kink turns, and pseudoknots [1]. The versatility of RNAs in forming simple and intricate three-dimensional (3D) structures allows RNAs to perform critical cellular functions, including catalysis and ligand binding, and promotes RNA-RNA and RNA-protein interactions. Understanding the 3D structures of RNAs is essential for revealing RNA functions in gene expression. For example, RNA structure probing was used to show that a mutation in the ribosomal protein L10 affects the rotational state of the ribosome, which in turn impacts the fidelity of the translation machinery [2]. However, RNAs carry out their diverse functions in their active forms in vivo. Researchers, therefore, face the challenge of probing a vast array of RNA structures in vivo to elucidate the functional roles of RNAs in cellular processes [3].
RNA structure-function relationships inside the cell are affected by the transcription rate, the local solution environment, and the presence of small molecules or RNA-protein interactions. The physical state of RNA in vivo can provide insight into the function of the RNA [4]. Structural biology approaches, including X-ray crystallography, NMR spectroscopy, and single-particle cryo-electron microscopy, have shaped much of our understanding of the 3D structures of many RNAs. While these techniques provide atomic-resolution structures of RNA molecules, they require highly pure samples and specialized equipment and infrastructure, and they are time-consuming. Furthermore, because these approaches study RNA structures in vitro, they may not provide a comprehensive view of the conformations that the RNA molecules adopt in their native environment [5]. Hence, indirect techniques have been developed as an effective alternative to structural approaches to study RNA structure in vivo and in vitro.
RNA chemical probing is the fastest way to indirectly investigate RNA structures: chemical reagents modify nucleotides at specific positions, and the modification efficiency at each site is then analyzed. RNA structure probing techniques can target ribose sugars or nitrogenous bases. Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) uses an electrophilic carbonyl derivative that is attacked by the nucleophilic 2'-hydroxyl group of the ribose sugar [6]. Modifications in the RNA backbone result in stops during reverse transcription (RT). Thus, RT stop sites reveal the locations of the modification adducts. Quantifying these stops by denaturing urea polyacrylamide gel electrophoresis (urea-PAGE) or deep sequencing provides a measure of the accessibility and/or conformation of each nucleotide [7]. While SHAPE provides information about the rigidity of the RNA backbone, it does not directly report on nucleotide base pairing. Base-specific modification is an alternative, complementary approach for RNA structural probing. Dimethyl sulfate (DMS) is used as a base-specific probing reagent against the N1 of adenines and the N3 of cytosines that are not involved in base pairing or hydrogen bonding [8]. DMS can also methylate guanosine at position N7, away from the Watson/Crick face [9,10]. DMS is cell-permeable because of its small size. Therefore, DMS modification reactions readily occur in nearly all in vivo conditions without additional permeabilization [8]. However, DMS is unable to probe uracils or guanines on the Watson/Crick face. The lack of reliable chemical probes for guanines and uracils limits the potential of RNA structure probing approaches to assess RNA conformations. Thus, new families of chemicals have been explored to probe guanines and uracils in vivo.
In the glyoxal family, glyoxal (GO), methylglyoxal (MGO), and phenylglyoxal (PGO) are potential candidates for probing guanines. These chemical probes are carbonyl derivatives that are electrophilic towards the nucleophilic amidine group in adenine, cytosine, and guanine. As uracil lacks this functional group, it does not react with glyoxal family compounds. The pKa of the imino group of the amidine moiety of adenine and cytosine (N1 of adenine or N3 of cytosine) is much lower than that of guanine, making adenine and cytosine less reactive than guanine with glyoxal family compounds [11]. Glyoxal family compounds were shown to be effective guanine probing agents in vivo in the eukaryotic model Oryza sativa, the Gram-positive bacterium Bacillus subtilis, and the Gram-negative bacterium Escherichia coli [11]. The use of glyoxal and its derivatives has not been established for in vivo RNA structure probing in other model systems.
Saccharomyces cerevisiae and Candida albicans are two well-studied fungi that also serve as model organisms for studies of conserved biological processes. S. cerevisiae is a simple eukaryotic organism whose RNA biology shares many features with higher eukaryotes, making it a suitable model organism to study different aspects of gene expression [14]. C. albicans is an opportunistic fungal pathogen and the most prevalent cause of fungal infections [15]. While there are established procedures for in vivo probing of the adenine and cytosine positions within yeast RNAs using DMS [8,9,16], guanines and uracils have so far escaped in vivo probing in yeast cells. In this work, we test the application of glyoxal and carbodiimide derivatives for RNA structure probing in the widely used budding yeast model system Saccharomyces cerevisiae and in the human fungal pathogen Candida albicans. We compare three different glyoxal derivatives and show that PGO yields the highest modification rate of guanines in yeast without affecting cell viability or RNA processing. We also present the conditions for in vivo modification of uracils in yeast using CMCT.

Materials and methods
Yeast cell culture
The BY4741 strain of Saccharomyces cerevisiae and the BWP17 strain of Candida albicans were grown in YPD medium at 30˚C to an optical density of 0.5-0.6 before applying the desired chemical probes.

Chemical treatment
GO and MGO were purchased as water-based solutions. PGO and CMCT were solubilized in DMSO and water, respectively. S. cerevisiae and C. albicans were incubated with glyoxal family compounds (GO, MGO, PGO). S. cerevisiae was treated with CMCT. In each case, the pH of the media was around 7.5. For each compound, three different concentrations were tested. A no-compound solvent-treated sample was used as the negative control. Two incubation times (5 and 15 min) were tested for all conditions, and two biological replicates were analyzed. GO concentrations were 30 mM, 60 mM, and 120 mM. MGO and PGO concentrations were 5 mM, 10 mM, and 20 mM. CMCT concentrations were 25 mM, 50 mM, and 100 mM. All incubations were performed at 30˚C. After incubation, samples were cooled down immediately on ice, and cells were harvested by centrifugation. Cells were washed three times with ice-cold water before RNA extraction.
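The treatment design described above can be enumerated programmatically, for example to generate a sample sheet. The following is a minimal sketch, not part of the published protocol: the concentrations, incubation times, and replicate number are taken from the text, while the function name `build_conditions` and the assumption that a solvent-only control (0 mM) was run at every time point and replicate are illustrative.

```python
from itertools import product

# Concentrations (mM) and incubation times (min) as stated in the text.
CONCENTRATIONS_MM = {
    "GO": [30, 60, 120],
    "MGO": [5, 10, 20],
    "PGO": [5, 10, 20],
    "CMCT": [25, 50, 100],
}
TIMES_MIN = [5, 15]
REPLICATES = [1, 2]  # two biological replicates


def build_conditions(compound):
    """Return one record per sample for a compound (hypothetical helper).

    Assumption: the solvent-only negative control (0 mM) is included
    at every incubation time and replicate.
    """
    doses = [0] + CONCENTRATIONS_MM[compound]
    return [
        {"compound": compound, "conc_mM": conc, "time_min": t, "replicate": rep}
        for conc, t, rep in product(doses, TIMES_MIN, REPLICATES)
    ]


pgo_samples = build_conditions("PGO")
# 4 doses (including control) x 2 times x 2 replicates
print(len(pgo_samples))
```

Under these assumptions, each compound yields the same number of samples per organism, which makes it straightforward to check that no condition is missing before loading a gel.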

Total RNA extraction and purification
Total RNA was extracted from cell pellets harvested from 10 mL of culture grown to mid-log phase using hot acid phenol. The extracted RNA was treated with DNase I for 15 minutes at 37˚C and further purified using a Quick-RNA miniprep kit (Zymo Research) according to the manufacturer's protocol.

Reverse Transcription (RT)
Approximately 1 μg of purified RNA in 10 μL was mixed with 2 μL of 0.6 μM 32P-labeled primer. Annealing was performed by incubation at 65˚C for 5 minutes, followed by gradual cooling at room temperature over 10 minutes and a final incubation on ice. Reverse transcription was performed using SuperScript III (ThermoFisher) at 50˚C for 5 minutes per the manufacturer's manual, using a primer annealing to the 3'-end of the 5.8S rRNA (AAATGACGCTCAAACAGGCATG). To remove the RNA template, 1 μL of 4 M NaOH was added and the reaction was heated at 95˚C for 5 minutes. The cDNA products were mixed with formamide loading dye, heated at 95˚C for 5 minutes, and separated on a prewarmed 6% urea/acrylamide sequencing gel in 0.5X TBE buffer. The gel was dried and exposed to a phosphoscreen. Data were quantified using Image Lab Software (Bio-Rad) and analyzed in GraphPad Prism 8.0.
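Because the RT primer is complementary to its target, the rRNA segment it anneals to can be read off as the reverse complement of the primer. The short sketch below illustrates this for the primer sequence given above; the function names are illustrative and the code makes no claim about how the authors designed the primer.

```python
PRIMER = "AAATGACGCTCAAACAGGCATG"  # 5'->3' DNA primer, as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def primer_target_rna(primer):
    """Return the RNA sequence (5'->3') that a DNA primer anneals to."""
    return reverse_complement(primer).replace("T", "U")


# The RNA segment at the 3'-end of the 5.8S rRNA that this primer should pair with.
print(primer_target_rna(PRIMER))
```

The same helper can be applied to the Northern blot oligos to check which strand and region each probe recognizes.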

Northern blot for snoRNA and tRNA
Total RNA from two biological replicates treated with PGO or CMCT was isolated using the hot phenol method. RNAs were separated on 8% acrylamide/urea gels, transferred to Hybond nylon membrane (GE Healthcare), and probed using 32P-labeled DNA oligos against U3-3' (AAAGTGGTTAACTTGTCAG), tRNA(Leu) (GCATCTTACGATACCTG), and scR1 (ATCCCGGCCGCCTCCATCAC).

Serial dilution spot test for chemical toxicity assessment
S. cerevisiae was incubated with PGO as described above. The density of the culture was adjusted to 10^7 cells/mL, followed by four successive 10-fold serial dilutions. Dilutions were spotted onto YPD plates and grown at 25˚C, 30˚C, and 37˚C for 48 hours.
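The arithmetic of the dilution series can be made explicit as follows. This is a minimal sketch: the starting density and the four 1:10 steps come from the text, whereas the spot volume (`SPOT_VOLUME_UL`) and the inclusion of the undiluted culture in the series are assumptions made only for illustration.

```python
START_DENSITY = 1e7   # cells/mL, from the text
DILUTION_FACTOR = 10  # each step is a 1:10 dilution
N_DILUTIONS = 4       # four successive dilutions
SPOT_VOLUME_UL = 5    # hypothetical spot volume; not stated in the text


def dilution_series(start, factor, steps):
    """Return cell densities for the starting culture and each successive dilution."""
    return [start / factor**i for i in range(steps + 1)]


for density in dilution_series(START_DENSITY, DILUTION_FACTOR, N_DILUTIONS):
    # Convert uL to mL to estimate how many cells land in one spot.
    cells_per_spot = density * SPOT_VOLUME_UL / 1000
    print(f"{density:.0e} cells/mL -> about {cells_per_spot:.0f} cells per spot")
```

With these assumed values, the most dilute spot contains only a handful of cells, which is the range in which differences in individual colony size become visible.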

PGO is highly effective for probing guanine nucleotides in S. cerevisiae
To define the best glyoxal derivative for probing guanines in yeast, we tested different concentrations of glyoxal (GO), methylglyoxal (MGO), and phenylglyoxal (PGO) at two incubation times. The tested concentrations of GO and MGO were chosen based on their effect on yeast cell growth, as a proxy for their cell penetration [17]. Based on this analysis, 60 mM GO and 10 mM MGO resulted in an ~50% reduction in the growth rate of S. cerevisiae cells. Therefore, we tested these concentrations as well as 0.5X and 2X of each (30, 60, and 120 mM GO, and 5, 10, and 20 mM MGO). PGO concentrations were chosen based on the effect on yeast mitochondrial ATP synthase [18,19], where 10 mM PGO greatly destabilized the F1-ATPase in S. cerevisiae. We, therefore, tested 5, 10, and 20 mM concentrations of PGO.
The 5.8S rRNA is a part of the large ribosomal subunit. Several positions in rice 5.8S rRNA are subject to glyoxal modification, including G82, G89, and G99 (equivalents of G78, G85, and G95 in S. cerevisiae) [11]. To probe the effectiveness of GO, MGO, and PGO, we therefore studied the modification of the 5.8S rRNA of S. cerevisiae by these compounds (Fig 1). GO weakly modifies guanines in the 5.8S rRNA of S. cerevisiae in vivo, as evident from the weak modification of nucleotides at positions G78 and G85 (Fig 1A). At the 5 min time point, GO modifications do not show a noticeable concentration dependence, as band intensity appears similar across all GO concentrations (Fig 1A and 1D). At the 15 min time point, the band intensity for G78 and G85 increases as the GO concentration increases (Fig 1A and 1D). The reactivity of MGO towards guanines in the 5.8S rRNA of S. cerevisiae in vivo is weaker than that of GO (Fig 1B and 1D). G78 and G85 modifications are comparable across all MGO concentrations, as band intensity remains the same at each time point and at each MGO concentration (Fig 1B and 1D). In contrast, PGO demonstrates high reactivity towards guanine nucleotides in the 5.8S rRNA of S. cerevisiae in vivo (Fig 1C). Further, at G78 and G85, modifications by PGO appear to be concentration-dependent, as band intensity at both positions increases as the PGO concentration rises (Fig 1C and 1D).
Unlike in rice, PGO does not modify G95 in S. cerevisiae 5.8S rRNA. To understand the reason for this, we analyzed the position of 5.8S rRNA G95 in the structure of yeast ribosomes [20]. In this structure, G95 is engulfed by the C-terminal tail of the ribosomal protein RPL37. In particular, residues Gln79 and Ser82 of RPL37 sandwich the base of G95 and come within hydrogen-bonding distance of the N1 of G95, thereby stabilizing its protonated state. The protonated state of the N1 disfavors the initial nucleophilic attack on one of the aldehyde carbons of glyoxal and its derivatives [11,21]. Thus, the sequestration of G95 in S. cerevisiae 5.8S rRNA by the C-terminal tail of RPL37 provides an explanation for the lack of reactivity of this nucleotide with PGO (S1 Fig). The high reactivity of PGO comes in part from the increased hydrophobicity conferred by the phenyl moiety in the molecule, which allows it to penetrate the phospholipid bilayer of the membrane [11]. The hydrophobic phenyl group on PGO can also strengthen interactions between PGO and hydrophobic protein residues, orienting the electrophilic PGO carbonyl group in place for nucleophilic attack by the amidine of guanine [11], thus resulting in higher reactivity relative to GO.

PGO treatment does not cause RNA processing defects
A critical concern for in vivo RNA structural probing is whether the RNA metabolic pathways in the cell change upon treatment with the chemical probing reagents. We therefore assessed how different RNA processing pathways are affected by PGO treatment by examining the processing of small nucleolar RNA U3 (U3 snoRNA) and a transfer RNA (leucine tRNA) using Northern blot analysis (Fig 2A). Total RNA extracted from yeast cells treated with various concentrations of PGO at two time points was used for the Northern blot analysis. We first analyzed the processing of U3 snoRNA, which is required for proper ribosome biogenesis. U3 is transcribed as a precursor with 5'- and 3'-extensions, which need to be removed to form the mature U3 snoRNA [22]. We used probes against sequences within the 3' region of the U3 precursor, and within the scR1 RNA as the loading control. The band intensities for U3-3' species remain the same relative to the loading control (scR1 RNA) even as the concentration of PGO increases, indicating that the treatment of yeast cells with PGO does not impact the processing pathway of U3 snoRNA (Fig 2B). As another control, we analyzed the processing pathway of the leucine tRNA, tRNA(Leu), which is transcribed as a precursor and undergoes processing before entering the translation pool [23]. We probed the mature and precursor tRNA using an oligo complementary to the mature part of the tRNA. Our analysis indicates that the levels of mature and precursor tRNA(Leu) do not change in the presence of PGO relative to the loading control (scR1) (Fig 2B). These data indicate that the tRNA(Leu) processing pathway remains unchanged in the presence of PGO. Together, our data indicate that treatment of yeast cells with PGO under our established conditions for RNA chemical probing is unlikely to affect the processing of major RNA species in the cell.
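The comparison described above rests on normalizing each band to the loading control in the same lane and then comparing treated lanes to the untreated lane. A minimal sketch of that calculation is shown below. The function names are illustrative, and the intensity values are hypothetical placeholders in arbitrary units, not measurements from this study.

```python
def normalize_to_loading_control(target_intensity, control_intensity):
    """Divide a target band intensity by the loading-control intensity of the same lane."""
    if control_intensity <= 0:
        raise ValueError("loading-control intensity must be positive")
    return target_intensity / control_intensity


def relative_levels(lanes, reference_lane):
    """Express each lane's normalized signal relative to the reference (untreated) lane."""
    normalized = {
        name: normalize_to_loading_control(target, control)
        for name, (target, control) in lanes.items()
    }
    reference = normalized[reference_lane]
    return {name: value / reference for name, value in normalized.items()}


# Hypothetical (target band, scR1 band) intensities; placeholders, not measured data.
example_lanes = {
    "0 mM PGO": (1000.0, 500.0),
    "10 mM PGO": (1100.0, 550.0),
    "20 mM PGO": (900.0, 450.0),
}

for lane, level in relative_levels(example_lanes, "0 mM PGO").items():
    print(f"{lane}: {level:.2f}")
```

A relative level close to 1 in every treated lane corresponds to the situation described in the text, where the target band does not change relative to scR1.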

Long treatment with high PGO concentration affects cell growth in S. cerevisiae
Because other pathways that can affect RNA molecules in the cell were not studied here, we analyzed the overall fitness of yeast cells upon PGO treatment. A serial dilution spot test was conducted on PGO-treated S. cerevisiae (Fig 3). At concentrations below 20 mM, PGO is not toxic to S. cerevisiae, as judged by the similar size of individual colonies of treated and untreated cells (Fig 3). However, at 20 mM and after a 15 min incubation, S. cerevisiae cells no longer withstand the toxicity of PGO, resulting in smaller colonies and slow growth (Fig 3). Although this can be due to the toxic effect of PGO on the mitochondrial ATP synthase [18,19], we cannot rule out possible effects of PGO on other RNA processing pathways in yeast. Based on these data, a 5 min incubation of S. cerevisiae cells with 20 mM PGO appears to be the optimal condition to achieve efficient nucleotide modification without affecting cell viability.

PGO can be used to modify Gs in other yeasts
Next, we sought to determine whether the conditions for modifying guanines in S. cerevisiae are applicable to other fungi. To this end, we tested the modification of the 5.8S rRNA of Candida albicans, a human fungal pathogen. Post-transcriptional regulation of gene expression is important for the pathogenicity of C. albicans [24]. Given the emergence of C. albicans as a public health threat, it is important to develop tools to study gene expression in this organism. Therefore, establishing effective conditions for guanine probing in C. albicans is of great importance for future pharmaceutical and biochemical applications. Having already established that PGO is the best guanine probe in S. cerevisiae, we used PGO to probe the 5.8S rRNA in C. albicans. PGO modifies G77 and G84 in C. albicans (equivalent to G78 and G85 in S. cerevisiae) (Fig 4A). However, the band intensities for these two nucleotides are weaker overall in C. albicans than in S. cerevisiae, indicating a lower frequency of guanine modification by PGO in C. albicans (Figs 1D and 4B). This difference can arise from differences in the local environment of G77 in C. albicans 5.8S rRNA relative to its S. cerevisiae counterpart. An alternative explanation could be that overall PGO entry into C. albicans cells is less efficient than into S. cerevisiae cells. Nonetheless, these data suggest that PGO can be used for probing guanines in C. albicans.

CMCT can be used to modify uridines in S. cerevisiae
Chemical reagents for probing uracils, and their reaction conditions, have yet to be established in yeast in vivo. CMCT, a carbodiimide derivative, has been used to probe Us in vitro [25]. However, CMCT has been considered poorly suited for in vivo modification of uridines because of its low membrane permeability [26]. 1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) has recently been established for uracil probing in different organisms [12,13]. However, despite numerous efforts, we could not extract RNA from EDC-treated yeast cells because addition of the compound resulted in severe precipitation in the media (data not shown). Therefore, we set out to establish the chemical conditions for use of CMCT for probing uracils in the 5.8S rRNA of S. cerevisiae. While EDC effectively modifies several positions on rice 5.8S rRNA [13], none of those sites are reactive with CMCT. However, CMCT demonstrates effective uracil modification in S. cerevisiae in vivo at positions U81 and U82 (Fig 5). The modification intensity is the highest at 100 mM CMCT, irrespective of the incubation time. CMCT does not penetrate the cell wall effectively because it is bulky and carries a positive charge on its quaternary ammonium group [12]. This can explain the need for much higher concentrations of CMCT, compared to PGO, to achieve effective modification (Fig 5). CMCT can react with Gs, albeit to a lesser extent than with Us [25]. However, we did not observe any reactivity with Gs in the probed 5.8S rRNA region (Fig 5A and data not shown).

Extended CMCT treatment causes RNA processing defects in S. cerevisiae
We next assessed the effect of CMCT treatment on the RNA processing pathways in S. cerevisiae using Northern blot (Fig 6A). The processing of the 3' end of U3 snoRNA is not affected in the presence of varying concentrations of CMCT after a 5-minute treatment (Fig 6B). However, extended treatment with CMCT results in a decrease in the level of U3 precursor (Fig 6B). This effect is more pronounced for the pre-tRNA product, the level of which decreases dramatically after 15 minutes of treatment with various concentrations of CMCT (Fig 6B). While a defect in the processing of RNA precursors usually results in the accumulation of the processing intermediates, a decrease in the levels of these intermediates suggests either a global destabilization of the RNA species or a defect in RNA transcription. Based on these data, long incubation periods with CMCT (>5 min) should be avoided.
In summary, this work establishes the chemical conditions in which PGO and CMCT can effectively probe guanine and uracil nucleotides, respectively, in yeast cells. We found that PGO is the most potent probe within the glyoxal family for probing guanines in yeast in vivo. PGO incubation with yeast does not affect its RNA processing pathways, and yeast cells can withstand the toxic effect of PGO at concentrations below 20 mM and at incubation periods shorter than 15 minutes. CMCT can be used to probe uracils in yeast in vivo.